Text Categorization – A Review

Authors

  • Rajni Jindal
  • Shweta Taneja
Abstract

With the growth of the internet, the amount of digital information is growing exponentially. This information may be structured or unstructured. The need to convert unstructured text into structured form and to infer knowledge from it gave rise to the field of text mining. Text documents may take the form of online news articles, emails, scientific documents, or reports. Text classification, or categorization, is the process of assigning free-text documents to predefined categories based on the nature and characteristics of each document. A number of statistical and machine learning techniques, or classifiers, are applied to text categorization. In this paper, we survey the state of the art in text categorization, the different classifiers available, and the applications of text categorization.
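To make the idea of assigning documents to predefined categories concrete, the following is a minimal sketch of one common statistical classifier, multinomial Naive Bayes with add-one smoothing. It is not taken from the paper: the class name `NaiveBayesTextClassifier`, the whitespace tokenizer, the two categories, and the toy training sentences are all assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately naive tokenizer."""
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Hypothetical multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        # Count documents per category and word occurrences per category.
        self.class_doc_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()
        for text, label in zip(documents, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_doc_counts.items():
            # Log prior of the category plus log likelihood of each word.
            score = math.log(doc_count / self.total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training data (hypothetical categories, not from the paper).
train_docs = [
    "the team won the match",
    "the player scored a goal",
    "the election results were announced",
    "the parliament passed the new law",
]
train_labels = ["sports", "sports", "politics", "politics"]

classifier = NaiveBayesTextClassifier().fit(train_docs, train_labels)
predicted = classifier.predict("the player won the match")
```

A real system would typically add proper tokenization, stop-word removal, and feature selection before training, and would be evaluated on held-out documents rather than on a handful of sentences.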

Similar Resources

A Systematic Review of Banking Business Models with an Approach to Sustainable Development

Modern banks have shifted their function as purely administrative, economic and industrial entities into socio-political institutions that must be sensitive to the surrounding environment. This function has always been neglected. This study was conducted based on primary, secondary, and tertiary data and reviews the full text of 75 studies selected from more than 245 studies. The selected elect...

Improving the Operation of Text Categorization Systems with Selecting Proper Features Based on PSO-LA

With the explosive growth in the amount of information, tools and methods are needed to search, filter, and manage resources. One of the major problems in text classification relates to high-dimensional feature spaces. Therefore, the main goal of text classification is to reduce the dimensionality of the feature space. There are many feature selection methods. However...

Text based categorization predictions from book reviews

Automatic categorization is of great significance nowadays, when massive amounts of data flow to us every day, especially for companies that receive and need big data. It is especially important, and very challenging, to define a categorization when some of the data are unclassified or unlabeled. What we do is categorize a review text into predefined categories based on its text features. ...

A Review on Various Text Mining Techniques and Algorithms

Text mining is the method of extracting meaningful information, knowledge, or patterns from text documents drawn from various sources. Pattern discovery from text and the organization of documents is a well-known problem in data mining. At present, the amount of stored information is increasing enormously day by day, which is generally in the unstructured form an...

A Review on Categorization of Text Data Using Side Information

In today’s digital environment, text databases are growing rapidly due to the use of the internet and communication media. Different text mining techniques are used for knowledge discovery and information retrieval. Text data contains side information alongside the text itself. Side information may be the metadata associated with text data like author, co-author or citation network, document pro...

Publication date: 2013